16 research outputs found
MUST-CNN: A Multilayer Shift-and-Stitch Deep Convolutional Architecture for Sequence-based Protein Structure Prediction
Predicting protein properties such as solvent accessibility and secondary
structure from the primary amino acid sequence is an important task in
bioinformatics. Recently, a few deep learning models have surpassed the
traditional window-based multilayer perceptron. Taking inspiration from the
image classification domain, we propose a deep convolutional neural network
architecture, MUST-CNN, to predict protein properties. This architecture uses a
novel multilayer shift-and-stitch (MUST) technique to generate fully dense
per-position predictions on protein sequences. Our model is significantly
simpler than the state-of-the-art, yet achieves better results. By combining
MUST and the efficient convolution operation, we can consider far more
parameters while retaining very fast prediction speeds. We beat the
state-of-the-art performance on two large protein property prediction datasets.
Comment: 8 pages; 3 figures; deep learning based sequence-to-sequence prediction. In AAAI 2016.
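A minimal sketch of the shift-and-stitch idea in PyTorch, assuming a toy pooled 1D CNN (the layer sizes, pool factor, and wrap-around shift via torch.roll are illustrative choices, not the paper's configuration): the pooled network is run on each shifted copy of the input, and the downsampled outputs are interleaved to recover one prediction per sequence position.

    # Toy pooled 1D CNN; sizes and pool factor are illustrative assumptions.
    import torch
    import torch.nn as nn

    class TinyPooledCNN(nn.Module):
        def __init__(self, n_tags, emb=16, pool=2):
            super().__init__()
            self.pool_factor = pool
            self.conv = nn.Conv1d(emb, 32, kernel_size=5, padding=2)
            self.pool = nn.MaxPool1d(pool)   # downsamples the length by `pool`
            self.out = nn.Conv1d(32, n_tags, kernel_size=1)

        def forward(self, x):                # x: (batch, emb, length)
            return self.out(self.pool(torch.relu(self.conv(x))))

    def shift_and_stitch(model, x):
        # Run the pooled network on each shifted copy of the input and
        # interleave the outputs, recovering one prediction per position
        # despite the internal downsampling. Assumes length % pool == 0;
        # torch.roll wraps at the boundary, a simplification over padding.
        k = model.pool_factor
        outs = [model(torch.roll(x, shifts=-s, dims=2)) for s in range(k)]
        stacked = torch.stack(outs, dim=3)   # (batch, tags, length/k, k)
        return stacked.flatten(2)            # position j*k+s <- shift s, index j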
Reevaluating Adversarial Examples in Natural Language
State-of-the-art attacks on NLP models lack a shared definition of what
constitutes a successful attack. We distill ideas from past work into a unified
framework: a successful natural language adversarial example is a perturbation
that fools the model and follows some linguistic constraints. We then analyze
the outputs of two state-of-the-art synonym substitution attacks. We find that
their perturbations often do not preserve semantics, and 38% introduce
grammatical errors. Human surveys reveal that to successfully preserve
semantics, we need to significantly increase the minimum cosine similarities
between the embeddings of swapped words and between the sentence encodings of
original and perturbed sentences. With constraints adjusted to better preserve
semantics and grammaticality, the attack success rate drops by over 70
percentage points.
Comment: 15 pages; 9 tables; 5 figures.
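As a rough illustration of the adjusted constraints, the sketch below accepts a synonym swap only when both cosine similarities clear minimum thresholds; the threshold values and the embedding/encoding inputs are hypothetical placeholders, not the paper's calibrated settings.

    # Accept a synonym swap only if both word-level and sentence-level
    # cosine similarities clear minimum thresholds (values hypothetical).
    import numpy as np

    def cosine(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

    def swap_is_valid(word_emb_old, word_emb_new, sent_enc_orig, sent_enc_pert,
                      min_word_sim=0.9, min_sent_sim=0.95):
        return (cosine(word_emb_old, word_emb_new) >= min_word_sim
                and cosine(sent_enc_orig, sent_enc_pert) >= min_sent_sim)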
Black-box Generation of Adversarial Text Sequences to Evade Deep Learning Classifiers
Although various techniques have been proposed to generate adversarial
samples for white-box attacks on text, little attention has been paid to
black-box attacks, which are more realistic scenarios. In this paper, we
present a novel algorithm, DeepWordBug, to effectively generate small text
perturbations in a black-box setting that forces a deep-learning classifier to
misclassify a text input. We employ novel scoring strategies to identify the
critical tokens that, if modified, cause the classifier to make an incorrect
prediction. Simple character-level transformations are applied to the
highest-ranked tokens in order to minimize the edit distance of the
perturbation, yet change the original classification. We evaluated DeepWordBug
on eight real-world text datasets, including text classification, sentiment
analysis, and spam detection. We compare the results of DeepWordBug with two
baselines: Random (Black-box) and Gradient (White-box). Our experimental
results indicate that DeepWordBug reduces the prediction accuracy of current
state-of-the-art deep-learning models, including a decrease of 68% on average
for a Word-LSTM model and 48% on average for a Char-CNN model.
Comment: This is an extended version of the 6-page workshop version appearing in the 1st Deep Learning and Security Workshop, co-located with IEEE S&P.
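A minimal sketch of the black-box scoring-then-perturbing loop the abstract describes, with model_prob as a hypothetical black-box query interface and an adjacent-character swap as one example transformation (the paper's exact scoring functions and edit set differ):

    import random

    def score_tokens(tokens, label, model_prob):
        # Score each token by the drop in the predicted label's probability
        # when that token is removed: one black-box query per token.
        base = model_prob(tokens, label)
        return [base - model_prob(tokens[:i] + tokens[i + 1:], label)
                for i in range(len(tokens))]

    def char_swap(word):
        # One simple character-level transformation: swap two adjacent
        # characters, keeping the edit distance of the perturbation small.
        if len(word) < 2:
            return word
        i = random.randrange(len(word) - 1)
        return word[:i] + word[i + 1] + word[i] + word[i + 2:]

    def attack(tokens, label, model_prob, budget=3):
        # Perturb only the highest-scoring (most critical) tokens.
        scores = score_tokens(tokens, label, model_prob)
        ranked = sorted(range(len(tokens)), key=scores.__getitem__, reverse=True)
        out = list(tokens)
        for i in ranked[:budget]:
            out[i] = char_swap(out[i])
        return out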
Learning to Reason and Memorize with Self-Notes
Large language models have been shown to struggle with multi-step reasoning,
and do not retain previous reasoning steps for future use. We propose a simple
method for solving both of these problems by allowing the model to take
Self-Notes. Unlike recent chain-of-thought or scratchpad approaches, the model
can deviate from the input context at any time to explicitly think and write
down its thoughts. This allows the model to perform reasoning on the fly as it
reads the context and even integrate previous reasoning steps, thus enhancing
its memory with useful information and enabling multi-step reasoning.
Experiments across a wide variety of tasks demonstrate that our method can
outperform chain-of-thought and scratchpad methods by taking Self-Notes that
interleave the input text.
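A minimal sketch of the control flow this describes, with lm_generate and wants_note as hypothetical stand-ins for the underlying language model: while reading the input, the model may pause at any point to write a note, which is interleaved into the context so later steps can condition on it.

    def answer_with_self_notes(input_tokens, lm_generate, wants_note,
                               note_start="<note>", note_end="</note>"):
        # While consuming the input, the model may interrupt at any point
        # to write a note, which is inserted inline so that every later
        # step (and the final answer) can condition on it.
        context = []
        for tok in input_tokens:
            context.append(tok)
            if wants_note(context):          # model asks to think here
                note = lm_generate(context + [note_start], stop=note_end)
                context += [note_start] + note + [note_end]
        return lm_generate(context)          # final answer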